Storing Rich Text
I've been researching rich text editors and different approaches to storing
rich text. It's a surprisingly complicated question, so I thought I would
post what I'm finding.
Discourse uses (and expects all users to use) Markdown, and gives you a preview
panel while you are writing posts. The post editor comes with a toolbar that
(similar to what Discord does while I'm writing this post) provides shortcuts
that help the user write Markdown syntax. The Discourse API gives you both the
original Markdown and the 'cooked' HTML, which I guess it renders when
the post is saved or published. Sometimes HTML tags are also included in the
Markdown if you use a custom plugin.
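To make the raw/cooked idea concrete, here's a rough sketch in Python of storing both versions side by side. The `raw` and `cooked` field names match what the Discourse API returns, but everything else is my own assumption for illustration: `cook` is a toy converter that only handles bold and italic (and escapes any HTML, unlike Discourse with plugins), and the `Post` class is hypothetical, not how Discourse actually implements it.

```python
import html
import re
from dataclasses import dataclass


def cook(raw: str) -> str:
    """Toy Markdown-to-HTML step: escapes HTML, then handles **bold** and *italic* only."""
    text = html.escape(raw, quote=False)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)  # bold first
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)              # then italic
    return f"<p>{text}</p>"


@dataclass
class Post:
    raw: str     # Markdown exactly as the user typed it (the source of truth)
    cooked: str  # HTML derived from raw, cached so reads don't re-render

    @classmethod
    def save(cls, raw: str) -> "Post":
        # Re-cook on every save so the cached HTML never drifts from the source.
        return cls(raw=raw, cooked=cook(raw))


post = Post.save("Storing **rich** text is *hard*")
print(post.raw)     # what the editor loads when the user edits the post
print(post.cooked)  # <p>Storing <strong>rich</strong> text is <em>hard</em></p>
```

The point of keeping both is that editing always starts from `raw`, while display can use `cooked` without running the converter on every page view.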
As a user experience, I think even with toolbars that provide shortcuts etc,
expecting users to learn and write (or even look at) Markdown is probably the
biggest issue with Discourse, especially when writing long or media-rich
posts. Markdown is very simple to store though, and benefits from being more
agnostic to presentation style.
For editing long-form content like blog posts, I guess the most popular paradigm
is WYSIWYG rich text editing like you get in Microsoft Word, WordPress etc.
In total I've looked into the following open source options for rich text
editors:
- CKEditor
- Quill
- Editor.js
- Draft.js
- Slate
- ProseMirror
Basically, my main findings are:
- The main options for storing rich text are HTML, Markdown and JSON. HTML is
  generated by most older WYSIWYG editors such as CKEditor. Storing text as JSON
  is preferred by a lot of people for lots of reasons, but I guess the big
  picture reason is that it provides an abstraction for content that can be
  resilient and useful across a variety of contexts, whereas HTML ties the
  content to one output format and could be limiting.
- Storing rich text as JSON basically means that the text is stored as a tree of
  nested node objects. I first encountered the concept with Editor.js, which is
  marketed as a block editor library, and this appealed to me because I’ve really
  enjoyed using Notion’s editor (block based with drag and drop) and
  Squarespace. We wouldn’t need anything so advanced, but having the freedom to
  define custom media blocks is a massive plus for making it easy to integrate
  lots of media types. I think it’s also going to be good for accessibility. I
  just realised last night that Editor.js doesn’t store inline styles as JSON,
  just blocks. Anything inside the block is styled using HTML. By contrast,
  other libraries like Slate store all content as JSON, both blocks and
  inline nodes. An advantage of this, I think, is that it makes it easier to
  implement Operational Transformation, which is a technique for reconciling
  simultaneous edits, which is what enables real-time collaboration.
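Here's a rough sketch of what the two JSON approaches in the findings above look like, written as Python dicts. The shapes are simplified assumptions based on how I understand each library's output, not exact copies of it; the `to_html` and `to_plain_text` helpers, the `image` block type and the example URL are all made up for illustration.

```python
import html
import json

# Slate-style document (simplified, assumed shape): blocks AND inline
# formatting are plain JSON. Leaf nodes carry text plus boolean marks.
slate_style = [
    {
        "type": "paragraph",
        "children": [
            {"text": "Storing "},
            {"text": "rich", "bold": True},
            {"text": " text"},
        ],
    },
    # A hypothetical custom media block.
    {"type": "image", "url": "https://example.com/cat.png", "children": [{"text": ""}]},
]

# Editor.js-style output (simplified, assumed shape): blocks are JSON, but
# inline formatting stays as an HTML string inside the block's data.
editorjs_style = {
    "blocks": [
        {"type": "paragraph", "data": {"text": "Storing <b>rich</b> text"}},
    ]
}


def to_html(node: dict) -> str:
    """Render one node of the Slate-style tree to HTML."""
    if "text" in node:  # leaf node: escape the text, then apply marks
        out = html.escape(node["text"])
        if node.get("bold"):
            out = f"<strong>{out}</strong>"
        return out
    if node["type"] == "image":
        return f'<img src="{html.escape(node["url"])}">'
    inner = "".join(to_html(child) for child in node["children"])
    return f"<p>{inner}</p>"


def to_plain_text(node: dict) -> str:
    """Render the same tree as plain text, e.g. for search indexing."""
    if "text" in node:
        return node["text"]
    return "".join(to_plain_text(child) for child in node["children"])


html_out = "".join(to_html(block) for block in slate_style)
text_out = "\n".join(to_plain_text(block) for block in slate_style)
stored = json.dumps(slate_style)  # the string that would be saved to the database

print(html_out)
print(repr(text_out))
```

The same stored tree feeds two different renderers here, which is the "useful across a variety of contexts" argument in practice. With the Editor.js-style shape, getting plain text out of a paragraph would mean parsing the HTML string inside `data` first.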